Quantum Feedback Control: 
How to use Verification Theorems and Viscosity Solutions to Find Optimal Protocols 
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While feedback control has many applications in quantum systems, finding optimal control proto- 
cols for this task is generally challenging. So-called "verification theorems" and "viscosity solutions" 
provide two useful tools for this purpose: together they give a simple method to check whether any 
given protocol is optimal, and provide a numerical method for finding optimal protocols. While 
treatments of verification theorems usually use sophisticated mathematical language, this is not 
necessary. In this article we give a simple introduction to feedback control in quantum systems, and 
then describe verification theorems and viscosity solutions in simple language. We also illustrate 
their use with a concrete example of current interest. 
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I. INTRODUCTION 

In quantum feedback control (QFC), an observer con- 
tinuously monitors a quantum system [1, 2], and uses the 
information from this measurement, as it is obtained, to 
control the system by continually modifying the Hamil- 
tonian and/or the observable being measured. Over the 
last decade the topic of QFC has generated an increas- 
ing amount of theoretical [3-40] and experimental [41- 
44] work. This is due partly to experimental progress in 
micro- and mesoscopic quantum systems [41-49], partly 
because QFC possesses a wide range of potential appli- 
cations [5, 6, 50-74], and partly because of its connec- 
tion with fundamental questions in quantum mechan- 
ics [11, 35, 74, 75]. 

Unlike in classical feedback control, in QFC the mea- 
surement that is part of the feedback loop affects the 
dynamics of the system [1, 2]. Even so, QFC is actually 
contained within the general framework of classical con- 
trol theory. This is because, even if the measurement in 
the feedback process is continually modified as the system 
evolves, quantum systems are merely specific examples 
of noisy non-linear dynamical systems. As a result, the 
techniques of classical control theory are usually applica- 
ble to quantum systems. However, most of the powerful 
results of control theory apply only to linear systems, 
and are therefore not relevant to the majority of quan- 
tum feedback control problems. An important exception 
are the "verification theorems" [76-80]. These are ap- 
plicable to feedback control in any dynamical system, 
and have two uses. The first is to test whether or not a 
given feedback protocol is the optimal protocol for a given 
task. If one has devised a protocol, by intuitive means 
or otherwise, it is very useful to be able to check if it is 
optimal, especially if the protocol admits an analytic so- 
lution. Knowing that a protocol is optimal is useful, but 
so is knowing that it is not — this tells us that there may 
yet be hidden and unexpected things to learn about the 
given problem. The second use of verification theorems 
is that they provide a systematic numerical procedure for 



finding optimal feedback protocols. 

The literature on verification theorems is not easily 
accessible to most quantum physicists, however, because 
it is written in the jargon of axiomatic probability the- 
ory (filtrations, adapted processes and the like). Further, 
the most widely applicable verification theorem, proved 
in the last few years, requires the use of "superderiva- 
tives" and "viscosity solutions" [81], a subject unfamiliar 
to most theoretical physicists. Our purpose here is to 
explain, in a straightforward manner and without tech- 
nical jargon, how to check if a feedback protocol is opti- 
mal, and how to find optimal protocols numerically. We 
also give a concrete example of the former, involving an 
optimal feedback protocol that we reported recently in 
Ref. [82] , and for which the viscosity verification theorem 
is essential. We note that we were first introduced to veri- 
fication theorems by the article of Wiseman and Boutcn, 
in which they used these theorems to prove the opti- 
mality of a number of quantum rapid-purification proto- 
cols [51, 73]. Since the usefulness of verification theorems 
in QFC is clear, we felt that an accessible article on this 
topic would useful. 

We begin in the following section by introducing the 
subject of quantum feedback control itself. While the 
concepts are simple, it does require stochastic calculus, 
and this is unfamiliar to many physicists. We thus start 
by discussing this aspect of QFC a little. We then present 
the equations of motion for a continuously monitored sys- 
tem, and show how the effect of feedback can be easily 
included. We next show how to quantify the problem of 
optimizing the feedback protocol for a given task. Having 
done this we now know what the optimization problem 
is for any given control task, and we can show how to 
use the "classic" verification theorem to check whether 
a feedback protocol is optimal. This is the subject of 
Sections III and IV. The classic verification theorem is 
sufficient so long as the evolution induced by the feedback 
protocol has continuous first and second derivatives. For 
many protocols the evolution of the system does not sat- 
isfy this condition, however. An example is a protocol 
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which switches abruptly at some time from one Hamilto- 
nian to another. In this case the evolution has continuous 
derivatives everywhere except at the point(s) where the 
protocol switches. To check optimality in this more gen- 
eral case we require a more general verification theorem. 
This theorem, while very similar to the first, requires the 
use of viscosity solutions. In Section V we explain how to 
calculate a superderivative (and subdcrivative) , and how 
to show if a function is a "viscosity solution" of a given 
differential equation. We can then show how to use the 
more general verification theorem to determine the op- 
timality of a feedback protocol that has discontinuities, 
and we do this in Section VI. With this we have achieved 
our main goal. In Section VIII we show how the verifi- 
cation theorems also provide a numerical method to find 
optimal feedback protocols, and in Section IX we show 
how, with almost no modification, the same techniques 
work for time-optimal control problems. Section X con- 
cludes with a brief summary. 



II. QUANTUM FEEDBACK CONTROL 

A. Including Feedback in Quantum Mechanics 

The dynamics of an isolated quantum system is given 
by Schrodinger's equation. If we describe the system us- 
ing a density matrix, p, then this equation is 



P = -{i/h)[H,p\. 



(1) 



If there is an environment that introduces noise into the 
system, then one can often include the effects of this envi- 
ronment by adding the terms of a Lindblad master equa- 
tion to the above Hamiltonian evolution [1]. The equa- 
tion of motion for p becomes 



p^-{i/h)[H,p\+-iV[c]p, 
where we have defined 



V[c]p = 2cVc 



c^cp 



pc)c. 



(2) 



(3) 



Here 7 is a rate constant, and c is an operator that de- 
pends on the coupling of the system to the environment. 
To add to the equation of motion the effect of monitoring 
the system, we simply add another set of terms. Before 
we do so, however, let us explain exactly what we mean 
by monitoring. 

Monitoring a system means that we obtain a continu- 
ous readout of some property of the system. Let us say 
that we are monitoring the position of an object. This 
means that in each tiny time-step dt, we obtain a little 
bit of information about this position, x. Why a tiny 
bit? Why not the precise position? To understand this, 
first recall that all real measurements (classical as well 
as quantum) have some inaccuracy, which means that 
the measurement result is x plus some random number 
(the error). This random number limits the amount of 



information we have about x (that is, it limits the ac- 
curacy to which we can pinpoint x after obtaining the 
result). Now, no physical measurement process can ex- 
tract a finite amount if information in an infinitely short 
time, because the speed of any interaction is necessarily 
finite. Thus in a time interval dt, the amount of infor- 
mation must also be infinitesimal, which means that the 
error must tend to infinity as — > 0. For a classical 
measurement the measurement result in a time interval 
dt is therefore given by 



R = X -\- e, 



(4) 



where x is the true value of x, and e must have a bigger 
variance the smaller dt. It turns out that the quantum 
equivalent of this is [1] 



R={X)+ e. 



(5) 



where X is the operator for the observable x [2] . It turns 
out that if the error is Gaussian, (being by far the most 
common measurement error because of the central limit 
theorem, and the one we will consider here), the variance 
of £ must be proportional to \/ dt (and thus the error is 
proportional to l/^/dt). For further details on the reason 
for this, we refer you to [1, 83]. Because of this the value 
of R becomes, strictly speaking, infinite as dt 00. So 
instead of writing our equations in terms of R, we write 
them instead in terms of Rdt, which we will denote by 
dr. This is a perfectly finite quantity. Se we have 



dr = {X)dt + gdW, 



(6) 



where g is an arbitrary constant determining the amount 
of the noise, and dW is a Gaussian random variable with 
mean zero and variance equal to dt. Thus in each interval 
dt, dW is picked from the probability density 



P{dW) = 



1 



V2TTdi 



-dW^ /[2dt) 



(7) 



Now that we know precisely what we mean by a con- 
tinuous measurement of a physical quantity x; we mean a 
measurement that provides a continuous stream of mea- 
surement results dr{t). Now we need to know how the 
state p changes in each time interval as a result of the 
measurement. In the interval dt, the state p{t) goes to 
p{t) + dp, where [1] 



dp = ~kV[X]p + V2k{Xp + pX - 2{X)p)dW, (8) 

where k = l/(8g^) and as usual {X) = Tt[Xp]. Loosely 
speaking, k gives the rate at which the measurement ex- 
tracts information about x. Here the random increment 
dW is precisely the same increment that appears in the 
measurement record dr. This makes sense: the change 
in the state of the system in the time interval dt depends 
upon the measurement result in that time interval. As a 
result we can also write the above equation for dp directly 
in terms of the measurement record: 

dp = -kV[X]p + 4:k{X p + pX-2{X)p){dr- {X)dt). (9) 
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Normally one would divide dp by dt to get the time 
derivative p, and write the equation of motion for p using 
this derivative. However, dW/dt is rather pathological 
(just as r is above), so when writing differential equa- 
tions involving dW we keep them in terms of differentials. 
Thus the full equation of motion for a noisy system that 
is also continuously monitored is 

dp = -{i/h)[H,p]dt + V[c]pdt- kV[X]pdt 

+ V2k{Xp + pX -2{X)p)dW. (10) 

This equation is referred to as a stochastic equation, be- 
cause it contains a term that fluctuates randomly. 

To be able to manipulate stochastic equations, one 
needs to use the rules of stochastic calculus. This boils 
down entirely to the rule dW'^ = dt. On first sight this is 
a very strange relation, because dW is random, and dt is 
deterministic. However, note that {dW'^) = dt. Because 
of this, if one divides any tiny time interval into many 
smaller sub-intervals, and adds up the many increments 
of dW^ , the result is a deterministic quantity in the limit 
in which there are infinitely many sub-intervals [84] . The 
result is that in the continuum limit it is always true that 
dW^ = dt. For further details regarding the manipula- 
tion and solution of stochastic equations we refer you 
to [1, 2, 83, 84]. 

Now that we have the evolution equation for a mon- 
itored system, we want to include feedback in the dy- 
namics. This is very simple. All we do is specify that 
the observer can change the Hamiltonian of the system, 
H, and also if we wish, the measured observable X, and 
allow H and X to be some function of the measurement 
results. Specifically. H and X at time t arc allowed to be 
any function of the measurement results, dr(t), obtained 
up until that time. We can write H, for example, as 
H{T{t)), with T{t) = /p /(t, t')dr{t') where /(t, t') is an 
arbitrary function of its arguments. Because the density 
matrix at time t completely determines the future be- 
havior of the system (given the system Hamiltonian), all 
optimal feedback protocols can be obtained by making 
H[t) and X{t) functions of p{t) (and possibly the cur- 
rent time, t). The observer obtains p{t) simply by using 
the measurement results to solve Eq.(lO) from the ini- 
tial time up until time t. Thus we choose H ~ H{t, p) 
and X = X{t,p). The equation for the evolution of p, 
including feedback, is then simply 

dp = -{i/h)[H{t,p),p\dt~kV[X{t,p)]pdt 

+V2k[X{t, p)p + pX{t, p) - 2{X{t, p))p]dW 
+V[c]pdt. (11) 

The problem of feedback control is to determine i/(t, p) 
and/or X{t,p) in order to achieve a desired evolution as 
closely as possible in the presence of noise. 



B. The Goal of Control 

In feedback control the goal is usually one of three 
things: to arrive at a desired state at a given time T; to 
produce an evolution that is as close as possible to some 
desired evolution; or to reach a given state as quickly as 
possible. The first two are called "finite-horizon" control, 
and the third "time-optimal" control. We will consider 
the first two of these when introducing the verification 
procedures. These procedures can be applied equally well 
to the third, with minor modification, and we discuss this 
in Section IX. 

The reason that the first two control objectives fall in 
the same category is that the first is merely a special 
case of the second: in the former we require the system 
to be close to a specific pure state at a given "horizon 
time" T; in the latter we require that the system be as 
close as possible to some given (time dependent) pure 
state, \'4){t))., at every time. This desired state is called 
the target state, or the target evolution. (Note that there 
is little point in choosing the target state to be mixed - 
mixing merely represents a lack of knowledge regarding 
the state, and this is not beneficial in a control setting.) 

Now consider the second control objective. To define 
precisely how well a control protocol achieves the objec- 
tive, we need to choose a measure of the distance from 
one quantum state to another to quantify what we mean 
by the system being close to the target state. There are a 
number of measures we can use, such as the fidelity [85], 
the distinguishability [86], or the quantum relative en- 
tropy [87], to name but three. If we choose the fidelity 
then we want our control protocol to minimize 

Fit) = i-{m\pit)\m)- (12) 

Since this is a function of time, to obtain a single quan- 
tity to minimize wc must combine F{t) at all the dif- 
ferent times into a single number. To use verification 
theorems, and indeed the majority of results of opti- 
mal control theory, we must combine the F^s in a sim- 
ple fashion: the function to minimize must be merely a 
weighted sum of the F's at different times. Since time is 
continuous, this means that the function to minimize is 
J = a{t)F(t)dt, for some a{t) and final time T. This 
means that the function to minimize is simply an integral 
over time of a function of the state p at each time. Fortu- 
nately this form is quite reasonable and well-motivated, 
and is rather general. For example, we can weight dif- 
ferent times differently in the integral if some are more 
important that others. If the final time is especially im- 
portant, we can give it extra weight by including it by 
itself in addition to the integral (that is. place a delta 
function weighting at time T). We can also include more 
general functions of the state, since we can place any 
function of the form L{p{t),t) in the integral. For exam- 
ple, if we just want to control the expectation value of 
a particular observable then L can be a function of this 
expectation value. Or we could choose L to be the von 
Neumann entropy of the state, as another example. 
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There is one more crucial thing to add. If we could 
both extract information from the system infinitely fast, 
and apply infinite forces to the system (control Hamil- 
tonians that induce infinitely fast dynamics) then we 
could easily make the system follow any target evolu- 
tion exactly, even in the presence of environmental dis- 
turbance and other noise. Feedback control is only non- 
trivial if the measurement rate and/or feedback forces arc 
constrained, which they always are in real applications. 
There are two ways to place constraints on the control 
inputs and the measurement rate. One is simply to place 
a fixed upper bound on them, and attempt to minimize 
J under these constraints. The alternative is to include 
a function of the measurement rate and feedback Hamil- 
tonian in L. This means that in minimizing J we are 
trying to find the protocol that gives us the best evolu- 
tion, while at the same time keeping the control forces 
as small as possible. The size of the feedback forces, and 
the rate at which we extract information, arc referred to 
as the "costs" of the control protocol. Because one of- 
ten includes these in the functional to minimize, J, this 
functional is usually called the cost. 

Finally, since the system is driven by noise, the actual 
cost (effort) expended by the controller on a given run 
will depend on the specific values that the noise takes 
for that run. These specific values are called the noise 
realization. Since the actual cost will vary from run-to- 
run, it is sensible to have the control protocol minimize 
the average value of J, where the average is over all noise 
realizations. 

To summarize succinctly the above discussion, a feed- 
back control problem is defined by the equation of motion 
of the system we are trying to control, and a cost of the 
form 



J= / L[p{t),t,Ha,{t),X{t)]dt + M{p{T))) , (13) 



that we choose based on our control objective, and that 
wc want to minimize. Here iJfb is the part of the Hamil- 
tonian that gives the forces applied by the controller (the 
"control Hamiltonian" ) . and we have separated out the 
contribution to the cost of the final state, calling this 
M{p{T)). It is useful to do this because in many control 
problems it is, in fact, only the final state that is impor- 
tant, so that one sets i = 0. The angle brackets with 
the subscript "n", (• • • )n, indicate that the value of the 
integral is averaged over all possible noise realizations. 



III. THE CLASSIC VERIFICATION THEOREM 

We now explain how to use the "classic" verification 
theorem of optimal control theory to determine whether 
a given protocol is the optimal one. 



A. First, a few definitions... 

We will now move to a new notation. This is to make 
the presentation easier, and to employ the same notation 
as used by most control theory texts. We write the gen- 
eral dynamical equation for the system we are trying to 
control as 

dx = A(t, X, u(x, t))dt + B{t, X, u(x, t))dW. (14) 

Here the state of the system is given by the vector x, 
and the control inputs to the system are denoted by the 
vector u (which is usually a function of the current state, 
x) [3]. Thus for quantum feedback control, the vector x 
is the vector of the elements of p: 

x^p. (15) 

The vector u is the set of all parameters that we can 
vary. If we write the control Hamiltonian as 

Ha,{t) = pi{t)Hi + p2{t)H2 + ■■■ + p^{t)H, + • • • , (16) 

where the pi are real numbers, then u is the set of these 
Pi, along with a set of numbers that determine the ob- 
servable we choose to measure, X{t). If we write X as 

X = UDU\ (17) 

where U is unitary and D is diagonal, then X is deter- 
mined in the most general case by the diagonal elements 
of D, and the N{N + l)/2 angles 9k that parametrize an 
A^-dimensional unitary [88] . Thus 

u^{p,,D,9k}. (18) 

The vector A is any vector- valued function of x, u and 
t, and gives the deterministic dynamics of the system. It 
is determined by the Hamiltonian H and the noise op- 
erator c. (Yes - the noise on the system actually gives 
deterministic motion, because we are considering the evo- 
lution of the density matrix. It is only the measurement 
that makes the evolution stochastic.) The matrix- valued 
function B gives the stochastic part of the evolution. It is 
determined by the measured observable X{t). The vector 
dW is a vector of mutually independent Wiener noises. 
As usual we will take the initial time to be t = 0, denote 
the initial state of the system as x(0) = Xq, and we will 
call the final (horizon) time t = T. 

The function u{x{t),t) completely defines the control 
protocol. It tells us what control inputs (what control 
Hamiltonian) to apply at any time t, for every possible 
state of the system at that time. Thus u{x{t),t) is the 
control protocol. 

The control inputs, u, are usually subject to limits. In 
general these limits can be defined by any closed region 
in the vector space in which u lives. The simplest set of 
conditions would be 

Uiow < U(x,t) < Uhigh- (19) 
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Note that the region (and thus uiow and Uhigh) is not a 
function oft or x; there is only one region that bounds the 
values control inputs for the duration of the control. The 
procedure we will describe is general enough, however, to 
handle a limiting region that varies with time. 

As described above, the control objective is to mini- 
mize a cost, J. Using our new notation, this cost is 



J = 



L(x, u(x, s), s)ds + A/(x(T)) 



(20) 



where L and M arc functions that we choose. For the 
verification procedure, we have to introduce one more 
quantity, called the cost function, C (x, t) . This is defined 
as the value of J over the interval [t,T], given that the 
system is at state x at time t: 



C(x,t) 



Lds + M(x(T)) 



(21) 



Now we come to the function G. This is 



G = -iTr 



Bt(t,x,v)^S(t,x,v) 



. dC 
-A - — - L t,x,v 
ax 



(24) 



Note that A and B are respectively the vector and ma- 
trix that appear in the equation of motion for the sys- 
tem, dC/dx is the vector of first derivatives of C, and 
d^C/d'x.^ is the matrix of second derivatives. To check 
that C satisfies the differential equation given by Eq.(23), 
the only potentially tricky thing is determining what the 
maximum of G is for each t and x — however, this can 
be fairly easy, depending on the system. 

4. Check that the u(x, t) for your protocol maximizes 
G (if this maximum is unique, then this also means that 

U(x,t) = Vmax(x, 0). 



Note that G{x,t) is the average cost that will have to 
be paid over the time remaining, given we have reached 
time t. For this reason it is often called the "cost-to-go" . 



B. The verification procedure 

To determine whether a given control protocol, u = 
f (x, t) is optimal, one performs the following four steps: 

1. Integrate the equations of motion of the system to 
calculate the cost function, C{x{t),t), for this protocol. 

2. Check that G satisfies two continuity conditions. 
These are that 



dC , d^C 



(22) 



are continuous. Here d^C jdy? denotes the matrix of sec- 
ond derivatives of C . 

3. Determine whether or not C(x, s) satisfies the fol- 
lowing differential equation (called the Hamilton- Jacobi- 
Bellman (HJB) equation): 



dC_ 
'dt 



max 



G X, V, -— 



dC &^G 



9x ' 9x2 



(23) 



with the final condition G(x,T) ^ M(x[T)) [4]. 

We will give the function G below. Before we do so, we 
note that the maximum is taken over the allowed values 
of a vector v. The allowed values of v arc those of u: v 
can only take values in the region that bounds the values 
of u(x, t). Note that because G is a function of t and x, 
one must maximize G separately at each time and at each 
value of x. Because of this, the value of v that maximizes 
G will in general be different at different values of t and 
x. Thus the v that maximizes G is in general a function 
of t and x: Vi„ax(x,t). 



IV. USING THE VERIFICATION THEOREM: 
AN EXAMPLE 

We now apply the verification theorem to a concrete 
example. This example was solved in Ref. [82], but with- 
out the details of the analysis. The problem involves 
feedback control of a 3-state quantum system (otherwise 
known as a qutrit). The purpose is to prepare the system 
in a given pure target state at time T, but it is only the 
measurement strength fc that is bounded; it is assumed 
the feedback Hamiltonian available to the controller gen- 
erates evolution that is much faster than the measure- 
ment strength, and the noise in the system. It is also 
assumed that the controller has the ability to implement 
any Hamiltonian for the qutrit. 

Because of the strong feedback Hamiltonian, the prob- 
lem becomes one of maximizing the largest eigenvalue of 
the density matrix. The reason for this is that, at any 
desired time, the controller can quickly evolve the system 
(that is, apply a unitary operator) so that the eigenvec- 
tor corresponding to the largest eigenvalue is equal to 
the target state. For a given density matrix, this unitary 
operation maximizes the probability that the system is 
in the target state, and is thus the optimal application 
of the Hamiltonian part of the feedback. In this case the 
probability that the system is in the target state is equal 
to the largest eigenvalue of the density matrix. The feed- 
back control problem in this case is therefore to choose 
the measured observable, X(t\ as the system evolves, 
so as to maximize the largest eigenvalue of the density 
matrix at time T . 

Since the feedback Hamiltonian is very large, it can 
also be used to continually rotate the eigenvectors of the 
density matrix so the density matrix eigenbasis remains 
constant with time. The result of this is that, when we 
express X in terms of this eigenbasis, the equations of 
motion for the eigenvalues of the density matrix are de- 
coupled from those of the eigenvectors, and so we only 
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need to consider the motion of the eigenvalues to find the 
optimal control protocol. 

In addition to the restriction on the measurement 
strength, we restrict the observable X to having equi- 
spaced eigenvalues. Since the measurement part of the 
master equation is invariant under the transformation 
X —f al, where a is a real number. X can be taken to 
be traceless without loss of generality. As a result, the 
measured observable is of the form 



X 




(25) 



where U is any unitary. 

Before solving the problem, we specialize to the 
"regime of good control" . This is the regime in which 
the probability that the system is in the target state, P, 
is close to unity [35, 82]. This means that P = 1 — A, 
where A ^ 1. One can then simplify the dynamics of the 
density matrix by expanding to first order in A. Recall 
that the goal of the feedback in this case is to optimize 
the largest value of the density matrix, and this problem 
is completely defined by the dynamics of the eigenvalues 
alone. 

We now denote the largest eigenvalue by Aq, the second 
largest by Ai , and the smallest by A2. Since ^ ■ Xi = 1, we 
have only two independent dynamical variables, Ai and 
A2. In the regime of good control, under a measurement 
of X, the equations of motion are [82] 



dAi 



dAa 



-V8{Xoo - Xn)XidW 

|2 '^1-^2 



-^^20! A2 



\Xi2 



Al — A2 

-V8{XoQ - X22)X2dW, 



dt 



dt 



(26) 



(27) 



where the matrix elements of X are those in the eigenba- 
sis of the density matrix at the current time. Note that 
the task is to choose X{t) so as to maximize Ao(T), which 
means minimizing A(r) = Ai(T) + A2(T). 

Our candidate for the optimal control protocol, sug- 
gested in [82], is to choose X at each time so that, in the 
eigenbasis of the density matrix, it is 



a 
X = \ a 




(28) 



This is suggested as the optimal protocol only up until the 
point at which Ai has been reduced to the value of A2 . We 
now use the procedure described in the previous section 
to determine whether or not this protocol is optimal. (If 
you really want to burn this procedure into your brain, 
you can stop reading now and try it.) 

To begin we need to obtain the cost function. For 
this we need the time evolution of the thing we want 



to minimize (the cost). A, under the suggested protocol. 
Under the protocol, the only non-zero elements of X are 
Xqi = XiQ = a, and the equations of motion become 



dXi = —Sa^Xidt, 
dX2 = 0. 



(29) 
(30) 



These are easy to solve, with the result that A(t) = 
Ai(0)e"^°^* + A2(0). Recall that the cost function, 
C{Xi, A2, t), is the value of A at the final time, T, given 
that we started with the values Ai and A2 at time t. So 
this is 



C(Ai,A2,t)=Aie-«"'(^-*)+A2 



(31) 



Note that all the derivatives of C are continuous, so the 
continuity conditions for the verification theorem are sat- 
isfied. 

Now we have to calculate the function G. The ingre- 
dients for this are the equations of motion, Eqs. (26) and 
(27), and the first and second derivatives of C with re- 
spect to Al and A2 . Calculating these derivatives we find 
that 



G{X) = S^AilXiol^e-^-^'^^-'^-f A2IX20I 
Al A2 



+8 



Al — A2 



i^2iini-e- 



8a^(T-t) 



).(32) 



We must now find the maximum of G{X) over all X, 
under the constraint Eq.(25). Thus we must maximize 
G over all unitaries U. It is important to note that the 
a that appears in G{X) above it not part of the opti- 
mization; this has already been fixed by the protocol. To 
perform the maximization, we note first that we are only 
considering times t such that Aie"®*^ * > A2. Because of 
this, we can show that G will be maximized if we maxi- 
mize 



F{X) = 8A2 [v\Xiof + \X20f + el^: 
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(33) 



where 77 and ^ are constants satisfying jy > 1 > ^. 
(Specifically 7] = Aie-8"'(^-*)/A2, and ^ = ?/[e«'^'(^-*) - 
l]/[Ai/A2 — 1].) We do this maximization over U numer- 
ically using Matlab's fminsearch function. (This func- 
tion uses the Nelder-Mead direct search algorithm.) This 
shows that the maximum is obtained when X is given by 
Eq.(28). Substituting this into G(X) we have 



maxG(X) = Ai^ 

u 



j2g-8a^(T-t)_ 



Calculating the time derivative of C we have 



dC 
'dt 



XiSa^e 



-8a^(T-t) 



(34) 



(35) 



Eqs. (34) and (35) are equal. Thus the cost function sat- 
isfies the Hamilton- Jacobi-Bellman equation (Eq.(23)), 
and so the protocol is optimal. 
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Recall that we have only tested the protocol up until 
the point at which the lowest two eigenvalues equalize. 
If t is the starting time, and T the final time, then this 
means that 



T <t 



ln(Ai/A2) 
8a2 ■ 



(36) 



V. VISCOSITY SOLUTIONS 

The purpose of "viscosity solutions" (a branch of the 
theory of differential equations) is to allow one to define 
a continuous function to be a solution of a second or- 
der differential equation, even though the function does 
not have well-defined derivatives. The reason that this 
subject is useful to us here, is that optimal feedback pro- 
tocols often produce cost functions (C(Ai,A2,T) in the 
previous section) that are not differentiable at some (usu- 
ally small, finite) set of points. It turns out that such 
cost functions are viscosity solutions of the HJB equa- 
tion, and as a result one can check their optimality by 
using a verification theorem that is very similar to the 
classic verification theorem in the previous section. So it 
turns out yet again that an initially rather odd-seeming 
branch of mathematics has a direct application. 

To use the more general verification theorem we first 
have to know how to show if something is a viscosity 
solution to a given DE. To do this we need two new def- 
initions, that of a superderivative (also called a superjet) 
and a subderivative (or subjet). To begin with, note that 
if a function /(y) is twice-differentiable at point yo, then 

/(y) - /(yo) = p • (y - yo) + (37) 



to second order in y — yo, where 

df 



dy 



(38) 



is the vector of first derivatives, and Q is the matrix of 
second derivatives: 



(39) 



We now define the superderivative, J~''(yo) of a function 
at the point yo as the set of all vectors p and matrices 
Q such that 



/(y) - /(yo) < p • (y - yo) 



(y-yo)'''Q(y-yo) 



(40) 



to second order in y — yo. The subderivative, J~(yo) is 
similarly defined by replacing the < with a >. In feed- 
back control, the functions one deals with are usually 
twice-differentiable everywhere except at a small number 
of points. So as an example, let us calculate the super- 
and subderivatives of the function 



/(y) 
fiy) 



f-iy) 
f+iy) 



In y. 



0<y <1 
y>i, 



(41) 
(42) 



at the point y = 1 (where it is not differentiable). To do 
this we first expand f± about the point y = 1, using the 
fist two terms of the Taylor series. This gives 



f-iy) 
f+iy) 



/-(I) 
/+(i) 



^y-i^y)', 



(43) 
(44) 



to second order in Ay = y — 1. So for the real numbers 
p and Q to be in the superderivative set, J^(l), we need 
them to satisfy the following two equations simultane- 
ously: 



Ay - Ay2 
2Ay2 



pAy 
pAy 



Q(Ay)', 

Q(Ay)^ 



Ay <0, 
Ay > 0. 



(45) 
(46) 



Note that the second order terms are irrelevant unless 
the first order terms on each side are equal. Examining 
equation Eq.(46) we see that the condition is satisfied so 
long as p > 0, or p = and Q > 2. Examining equation 
Eq.(45), the condition is satisfied when p < 1 (remember 
that Ay is negative for Eq.(45)), or p = 1 and Q > — 1. 
Since (45) and (46) must be satisfied simultaneously, the 
superderivative is the set 



J+(l) 




cx), oo) 



(47) 



For the real numbers p and Q to be in the subderivative, 
J^(l), they must satisfy 

Ay-(Ay)2 > pAy + Q(Ay)2, Ay < (48) 
2(Ay)2 > pAy + Q(Ay)2, Ay > 0. (49) 

From Eq.(48) we find that p> 1, and from Eq.(49) that 
p <0. Since these cannot both be true at once, J^(l) is 
the empty set: 



J-H) 



(50) 



Now that we know what superderivatives and sub- 
derivatives are ,we can define a viscosity solution to a 
differential equation. We first write our differential equa- 
tion in the form 



(51) 



A continuous function /(y) is a viscosity solution of this 
PDE if it is true that 

FiyJiy),P,Q) > 0, whenever (p,Q)GJ+(y) 
FiyJiy),p,Q) < 0, whenever (p,Q)eJ-(y). 

If the function / is a viscosity solution of a PDE at a given 
point, and its derivatives exist at this point, then it is a 
solution in the normal sense. Therefore, to determine 
if a function is a viscosity solution to a PDE, one first 
checks that it is a solution to the PDE in all regions where 
its first two derivatives are defined. Then one calculates 
the super- and subderivatives of / at those points where 
its derivatives are not defined, and checks that / is a 
viscosity solution at those points. 
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VI. THE ENHANCED VERIFICATION 
THEOREM 

The more general verification theorem, due to Zhou 
and collaborators [79, 80], states that u(x,t) is an opti- 
mal feedback protocol so long as the following conditions 
are satisfied by the cost function, C(x(t),i) [5]: 

1. The cost function must be a viscosity solution of the 
HJB equation, and it must be true that C(x(r),T) — 
M (T) . To determine whether these conditions are satis- 
fied, one first writes the HJB equation in the form 



dC 

dt 



G X, V, — 



dC d^C 



0. (52) 



If we set the vector y in Eq.(51) to y = (t, xi, . . . , xn), 
then the above HJB equation is exactly of the form given 
in Eq.(51). Because the HJB equation does not depend 
on any second derivatives that include t, to determine 
whether C is a viscosity solution, we do not need to con- 
sider these second derivatives when calculating the super- 
and subderivatives. Thus, for the purposes of solving the 
HJB equation, the superderivative at Xg and time Iq, 
J+(xo,to)j is the set of values of {q,p,Q) such that 

C(x, t) - C(xo, to) < gAi + p • Ax + Ax^QAx, (53) 

where At ~ t — to and Ax = x — xq . Naturally the sub- 
derivative is defined in the same way, with the inequality 
reversed. Armed with these super- and subderivatives 
one uses the procedure described in the previous section 
to determine if C is a viscosity solution to the HJB equa- 
tion. 

2. Check that u{t) maximizes G. 

3. For each t there must be a q{t), p{t) and Q{t), such 
that: 



and 



{q{t),p{t),Qit))eJ+{xit),t) (54) 



q{t) = G (i(i), u(x(t), t), p(t), Q(t)) , (55) 



where x(t) is the evolution of the system under our feed- 
back protocol u(x,t). The reason that we no longer 
need the maximization in this equation, is because C is 
a viscosity solution to the HJB equation (Eq.(52)), and 
u{t) = u(x(i),t) maximizes G. This means that setting 
X = x(t) and v = u(x(t),t) already maximizes G. 



VII. USING THE ENHANCED VERIFICATION 
THEOREM: AN EXAMPLE 

For this example we consider the same control prob- 
lem as the previous example (Sec. IV), but we consider 
the problem for all evolution times, not merely up un- 
til the two smallest eigenvalues have equalized. From 
this point onwards, the protocol suggested in Ref. [82] 



involves rapidly switching the measured observable be- 
tween X, given by Eq.(28), and the observable 



0a 
^2 = ( 
a 



(56) 



This is equivalent to measuring X while rapidly switching 
the eigenvectors of the eigenvalues Ai and A2. In the 
limit of fast switching, the equations of motion of both 
eigenvalues are identical, and given by 



A,; 



1.2. 



(57) 



Starting at time t, and evolving the eigenvalues us- 
ing the original protocol up until time t + t, where 
T = ln(Ai/A2)/(8a2), and then using the second protocol 
from that time until time T, we find that, for T > t + t, 
the cost function is 



C-(Ai,A2,t) = 2v/A^e"4'^ 



(58) 



For times T < t + t, the cost function is still given by 
Eq.(31), and we will call this C"*". 

The cost function is continuous everywhere, but is no 
longer differentiable at t = T — t. Thus we cannot use the 
classic verification theorem on the protocol, but must use 
the enhanced verification theorem. This requires that we 
show that C is a solution to the HJB equation on the 
intervals in which it has continuous derivatives, and that 
it is a viscosity solution at the point t = T—t. We already 
know that it satisfies the HJB equation on the interval 
t E [T — T, T] (that is, when the evolution time is less 
than t). So we examine next the interval t G [0,T — r]. 
To calculate G we need the first and second derivatives 
of with respect to Ai and A2. These are 



(59) 
(60) 
(61) 



dC- 


G- 


dXj 


~ 2A/ 


d^G- 


G- 


dX] 




d^c- 


C~ 


8X18X2 


4A1A2 



The expression wc get for G this time is 

-4(|Xio|" + |X20 



G{X) = (Xii - X22) 

\2 , w |2 



x2ir)G- 

We therefore need to find the U that maximizes 



(62) 



F{X) = {Xn - X22)^ + 4. (\X 



101 



\X 



20 1 



IX 
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Once again we perform this optimization numerically us- 
ing Matlab's fminsearch. It turns out that there are many 
unitaries that maximize F{X), and the unitaries that 
give the X's defined by Eqs. (28) and (56) both do so. 
So we have 



maxG(X) 

u 



Aa'G- 



(63) 



9 



This is exactly the time derivative of C , so the cost 
function does satisfy the HJB equation on the interval 

[T-r, T]. 

Our final task is to determine whether the cost function 
is a viscosity solution to the HJB equation dX t = T — t. 
To do this we would usually proceed to calculate the su- 
per and subderivatives, which would involve expanding 
C"*" and C~ in their Taylor series as describe in Sec. V. 
However, it turns out that in this case we can take a 
shortcut. First we note that the protocol involves switch- 
ing between two measurement operators, so the cost func- 
tion must be a solution to the HJB equations for both. 
The two HJB equations are 

f = 8|X,.PA.|^l, (64) 

f = (65) 

Because these do not contain the second derivatives, we 
do not need to calculate the Q part of the super and 
subderivatives to determine if C is a viscosity solution. 
We only need the first order parts, q and p. But since 
the first derivatives of C are continuous at t = T — r, q 
and p are exactly these derivatives [6] . So all we need to 
do is to substitute C"*" (or C~) into Eqs. (64) and (65), 
and check that it is a solution to both. Indeed it is, and 
thus C is a viscosity solution of the HJB equation, and 
we can conclude that the suggested protocol is optimal. 



by searching over the allowed space of controls v for the 
V that maximizes G. Note that if v is time-dependent, 
then we will have to search over all functions v for one 
that maximizes G at every time t. This can certainly be a 
challenging task - nevertheless, it does provide a system- 
atic procedure. Further, the problem simplifies in the 
following way: one can perform the search by starting 
close to the final time, and stepping backwards. That 
is, solving for the optimal v in the small time interval 
[T — At,T], for all states x at the start of this interval. 
Then, with v(a;,t) for that interval fixed at the optimal 
result obtained, solving for the interval [T — 2 At, T — A], 
and so-on moving backwards. Because of the form of the 
cost, Eq.(13), this backwards-in-time procedure will find 
a globally optimal v for the whole interval. (This point is 
discussed in most control texts that include the Bellman 
equation - see e.g. [89, 90].) 

If V is not a function of time, but only of the state, x, 
then the task is significantly easier, since not only is the 
search space reduced, but it usually means that if G is 
maximized at a single time, it is maximized at all times, 
so we need only evaluate G at a single time. If one can 
obtain an analytic solution to the equations of motion for 
every v, then one can obtain an analytic expression for 
G in terms of v. In this case numerical minimization is 
likely to be easy. 

IX. AN ALTERNATIVE FORM FOR THE 
COST: TIME-OPTIMAL CONTROL 



VIII. FINDING OPTIMAL PROTOCOLS 

The two verification theorems we have described above 
also provide a method to search for optimal protocols. 
To see how this works we return to the Hamilton- Jacobi- 
Bellman equation, which is 



dC 
dt 



G x,v, — 



dC a^G 



ax ' 5x2 



(66) 



Recall that G(x, t) is the total average cost over the re- 
maining time interval, [t, T] , given that the system has 
state X at time t. The elements of the vector v, in gen- 
eral being functions of the state x and time t, are the 
parameters that give the control protocol. The evolution 
of the system, x(t), and the cost function G(x, i), are 
determined by the choice of v. If the pair v and G solve 
the HJB equation when substituted into it, then v is an 
optimal protocol. 

The fact that allows us to use the HJB equation to 
search for optimal protocols, is that every pair of v and 
G satisfies the equation 



— - G (^x V — — 

dt \ ' ' (?x ' 9x2 



(67) 



However, G will only be maximized if v is an optimal pro- 
tocol. This means that wc can find an optimal protocol 



So far we have taken the cost to be a function of the dy- 
namical variables, integrated over time, including a sep- 
arate contribution at the final time. There is another 
useful form for the cost function that allows us to use 
the exactly the same verification procedures. This alter- 
native cost is what we use if we want to minimize the time 
taken to reach a particular event. This kind of control 
problem is called time- optimal control. In this case we de- 
fine a function /i(x(t), t) of the dynamical variables (and 
perhaps of time), and the goal is to minimize the time 
taken for h to cross a fixed value he, called the threshold. 
The cost is then defined as the expectation value of the 
time remaining before h crosses the threshold. Every- 
thing else is the same as before, the only change is this 
form of the cost. As before, the cost function, G(x, t), 
is the average value of the cost from time t until we fin- 
ish, given that the state at time t is x. That means that 
G(x, t) is the average value of time it will take to cross 
the threshold, given that the current time is t and cur- 
rent state is x. Thus if we define Xc as the state (or set 
of states) for which h ~ he, then G(xc,t) = 0. 

The only change in the verification theorem is that the 
HJB equation changes a little. It is now [76] 



dC 
'dt 



= max 



G X, V, -— 



dC d^C 



9x ' 9x2 



1, 



(68) 



where G is the same as before, except that it no longer 
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contains the function L: 



G ^ — Tr 
2 



Bt(^,x,v) — B(t,x,v) 



ox. 



(69) 



Note that the HJB equation does not contain either the 
function h{x{t),t), or the threshold value he [7]. These 
have already played their role by determining C. 



X. CONCLUSION 

We have shown how to use verification theorems to 
determine whether a given feedback protocol is optimal. 
The procedure is quite straightforward, and involves: 1. 
solving the equations of motion to obtain the cost func- 
tion, 2. checking two simple continuity conditions on the 
cost function, 3. checking that the cost function is a so- 
lution of the HJB equation, and 4. checking that the pro- 
tocol maximizes the RHS of the HJB equation. 

If the cost function does not satisfy the continuity con- 
ditions (step 2 above), then there is an enhanced verifi- 
cation procedure that can be used. This is essentially the 
same as the previous procedure, but one checks instead 
that the cost function is a viscosity solution of the HJB 
equation. 

We have also described how the HJB equation can be 
used, at least in principle, to find optimal protocols. For 
non-trivial problems this will usually, but not always, 
require a numerical implementation, and if so is likely to 
be numerically intensive. What we have not yet done is 
to show how the HJB equation is derived. This is actually 
quite straightforward, and we give this derivation in the 
Appendix. 



XI. APPENDIX 

A. The Hamilton-Jacobi-Bellman Equation 

To begin with we define the value function, ^(x, t), as 
the minimum of the cost function over all possible control 
protocols: 



F(x(t),t) =minCu(x(i),t), 



(70) 



where we have added the subscript u to the cost function 
to make explicit the fact that it depends on the choice 
of the control, u{x{t),t). Recall that the cost function, 
Cu(x(t),t), is the average value of the cost that will be 
incurred from time t until the final time. (The cost func- 
tion is defined in Section HI). The starting point for the 
HJB equation is the fact that the value function satisfies 
the recursion relation 



l/(x(t),t) 




L ds + V{x{t'),t') 



(71) 



where L = I/(u[x(s), s], x(s), s) is the instantaneous cost 
incurred at time s. This recursion relation follows im- 
mediately from two facts. The first is that any protocol 
that gives the optimal control from time to T, must 
also give the optimal control from any time t' > to T. 
Recall that the optimal protocol tells us what control to 
apply at every time t, for every state x(t) that the system 
could be in at that time. Thus if the protocol u mini- 
mizes Cu(x(t),<), then it also minimizes Cu(x(t'), <') for 
t' > t. The second fact is that the total cost is merely a 
sum (integral) over the separate costs L{t) at each time 
t. Because of this the total average cost incurred from 
time t onwards is the sum of the cost incurred from time 
t to t' , and the cost incurred from t' onwards. That is 



Cu(x(0,t) = ( / Lds + Cu(x(i'),0 



(72) 



Adding to this the fact that the optimal protocol u min- 
imizes both Cu(x(i),t) and Cu(x(t'), i'), we obtain the 
recursion relation for V (Eq.(72)). 

The HJB equation can be extracted from Eq.(72) by 
setting t' = t + dt and applying Ito's rule, dW^ = dt. 
Note that this means we must keep all the differentials 
up to the second order. To begin with we have 



/ <.t+dt \ 

F(x,f)=min( / Lds) {V {yi{t + dt) , t + dt)) ^ . 

u{x,t) \Jt I 

(73) 

We now substitute into this the Taylor expansion for 
V{:x. + dx, t + dt), being 



dV dV 

V(yi + dx,t + dt) = V(t,x) + dt— + dx- — 

at ax 

, t d'-V ^ 
+dx^ • —-5- • dx. 



and the equation of motion for the system, (Eq.(14)), 
being 

x+dx = x+A{t, X, u(x, t))dt+B{t, x, u(x, t))dW. (74) 
The result is the HJB equation 



dV_ 
'dt 



max < Tr 

u(x,t) I 2 



B\t,x, u)—^B{t,x, u) 



d. 



-A • 1^ - L{t,x, u] 
ox 



(75) 



Notice that in the last step we used the fact that 
— min(/) = max(— /) for any function /. A rigorous 
derivation of the HJB equation can be found in many 
stochastic control textbooks, an example being Ref. [76]. 
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